Identifying more bloggers: Towards large scale personality classification of personal weblogs

Authors

  • Scott Nowson
  • Jon Oberlander
Abstract

We report new results on the relatively novel task of automatic classification of blog author personality. Promisingly high classification accuracies have recently been reported for four important personality traits (Extraversion, Neuroticism, Agreeableness and Conscientiousness). But the blog corpus used in that work required careful preparation, and was consequently quite small (with fewer than a hundred authors; and less than half a million words). Here, we provide an initial report on the classification accuracies that can be achieved when classifiers conditioned on the small corpus are applied to a larger, automatically-acquired blog corpus, using lower-granularity personality data and substantially less manual preparation (with over a thousand bloggers, and approximately five million words). Predictably, results on the larger corpus are not as impressive as those on the smaller; nevertheless, they point the way forward for further work.

Similar resources

What Are They Blogging About? Personality, Topic and Motivation in Blogs

Personal weblogs (blogs) provide individuals with the opportunity to write freely and express themselves online in the presence of others. In such situations, what do bloggers write about, and what are their motivations for blogging? Using a large blog corpus annotated with the LIWC text analysis program, we examine the content of blogs to provide insight into the role of personality in motiva...

The Identity of Bloggers: Openness and Gender in Personal Weblogs

Work has recently been completed on a PhD Thesis concerning individual difference in the language of personal weblogs (Nowson 2005). This paper highlights some of the results. Blogs are increasingly used as a resource for academic study, as evidenced by this symposium. Bloggers are not, however, representative of the population as a whole: they are more likely to be teenage or 20-something fema...

Identifying Bloggers' Residential Areas

This paper proposes a method to infer bloggers’ residential areas. Identifying bloggers’ residential areas will be useful as another axis to retrieve weblogs or for tasks that resolve ambiguous objects in terms of geographic contexts. Our method focuses on the local context of geographic location terms and uses binary classifiers to decide whether the context is indicating the writer’s resident...

A Study on the Perception of Students towards Educational Weblogs

Weblogs are a popular form of easy-to-use personal publishing that has attracted millions of bloggers to share their personal thoughts, opinions, and knowledge on the web. The versatility of weblogs as a communication medium has attracted interest from educators. Educational applications of weblogs have so far included journals, e-portfolios, learning diaries, and logbooks. As in the case of ot...

Capturing Global Mood Levels using Blog Posts

The personal, diary-like nature of blogs prompts many bloggers to indicate their mood at the time of posting. Aggregating these indications over a large number of bloggers gives a “blogosphere state-of-mind” for each point in time: the intensity of different moods among bloggers at that time. In this paper, we address the task of estimating this state-of-mind from the text written by bloggers. ...

Publication date: 2007